AN URN MODEL FOR THE MULTI-SAMPLE
CAPTURE/RECAPTURE  SEQUENTIAL  TAGGING  PROCESS 


José Galvão Leite and Carlos Alberto de Bragança Pereira

IME-Universidade de São Paulo, CP 20570, CEP 01498, SP, Brazil

Key words and Phrases: CMRR sampling process; capture/recapture sequential
sampling process; random allocation; allocation process; sufficient statistic.

ABSTRACT 

The probability distributions associated with the multisample CMRR generalized
sequential sampling process are obtained by using an analogy with a single urn
model. Some statistical features are also discussed.

1. INTRODUCTION

The Capture/Mark/Release/Recapture (CMRR) sampling process is used
whenever informative data must be obtained in order to estimate the unknown size,
$N$, of a finite (and closed) population. The sampling design for such a process is
described here.

Consider a population of finite size, $N$, such that during the study time it
changes neither in size nor in form; that is, the population is closed during the
study time. From this population, $k$ ($k$ is fixed and $\ge 2$) random samples (without
replacement) are sequentially selected in the following manner.
The first random sample of (fixed) size $m_1$ ($\ge 1$) is drawn, without replacement.
After the sample units are marked and the number $m_1 = U_1$ is recorded, they are
returned to the population before the second sample is drawn. Next, for each $j$
($\ge 2$), the $j$-th random sample of (fixed) size $m_j$ ($\ge 1$) is drawn, without replacement.
The sample units marked in earlier selected samples are immediately returned to the
population. The remaining $U_j$ unmarked sample units are returned after being
marked. The numbers $m_j$ and $U_j$ are recorded. After the $k$ samples have been
obtained, the data

$$D_k = (U_1, \ldots, U_k)$$

are observed. Note that the number of distinct population units selected in the whole
sampling process is

$$T_k = U_1 + \cdots + U_k .$$

The objective of the present paper is to obtain the probability laws of $D_k$ and $T_k$
by using an equivalent urn model. By urn model we mean random allocations of
balls to urns.

The CMRR sampling scheme has a long reference list (see Seber, 1986), which
starts with Craig (1953) and Goodman (1953), although a related problem was
described earlier by Good (1950, p. 73). The majority of the papers [viz. Samuel
(1968) and Sen (1982), among others] consider only the one-by-one case (i.e.,
$m_1 = \cdots = m_k = 1$) and none of them presents the probability law of $D_k$, the raw data.
We believe that these restrictions are in fact necessary when difference equations
(the tool of many authors) are to be used to obtain these laws. The distribution of
$T_k$, for the general case of $m_j$ different from 1 for some $j$, is described in Johnson
& Kotz (1977, Section 5.3), where an analogy with the committee problem is used.
In that text also, no reference to $D_k$ is made. In fact, for inferences about $N$, it is
enough to consider only $T_k$ since it is a sufficient statistic for $N$ in relation to $D_k$, as
shown in Section 3. Note also that $T_k$ and $N$ are both positive integers while
$D_k$ is a non-negative integer vector of order $k$. We end this section by noting that the
sequence $(U_i)_{i \ge 1}$ is not an exchangeable sequence, which implies that it is not a
sequence of conditionally independent and identically distributed random variables.
Hence, $T_k$ is sufficient in the broad sense; that is, the conditional distribution of
$D_k$ given $T_k$ is the same for every possible $N$.


Consider an imaginary one-to-one correspondence between population units
and urns; that is, a different urn is assigned to each one of the $N$ population units.
Also consider $m = m_1 + \cdots + m_k$ balls numbered in the following way: $m_1$ with the
number one, $m_2$ with the number two, and so on up to $m_k$ with the number $k$.

To select, without replacement, $m_1$ population units to be marked corresponds
to randomly allocating to the urns the $m_1$ one-numbered balls, in such a way that no
urn receives more than one of these balls. To select, without replacement, the
second sample of $m_2$ population units corresponds to randomly allocating to the
urns the $m_2$ two-numbered balls, in such a way that no urn receives more than one
of these balls. To count the number $U_2$ of unmarked sample units (to be marked) is
equivalent to counting the urns, among the $m_2$ ones that received the two-numbered
balls, with only one ball. Sequentially following this analogy, consider the $j$-th
sample ($j > 1$). To select, without replacement, the $j$-th sample of $m_j$ population units
corresponds to randomly allocating to the urns the $m_j$ $j$-numbered balls, in such a
way that no urn receives more than one of these balls. To count the number $U_j$ of
unmarked sample units (to be marked) is equivalent to counting, among the $m_j$ urns
that receive the $j$-numbered balls, the ones with only one ball. (Note that at the end
of this allocation process, it may happen that many urns are empty, some have only
one ball, and so on up to a very few having $k$ balls.)
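The allocation scheme just described can be sketched as a small simulation. This is only an illustrative sketch: the function name `simulate_cmrr`, its arguments, and the example values of $n$ and the sample sizes are assumptions introduced here and do not come from the paper.

```python
import random

def simulate_cmrr(n, sizes, rng=None):
    """Simulate one run of the urn scheme and return (U_1, ..., U_k).

    n     -- number of urns (population units); hypothetical input
    sizes -- the fixed sample sizes (m_1, ..., m_k)
    """
    rng = rng or random.Random()
    marked = set()   # urns that already hold at least one ball
    data = []
    for m_j in sizes:
        # Allocate the m_j j-numbered balls so that no urn gets two of them.
        sample = rng.sample(range(n), m_j)
        # Urns holding only the j-numbered ball are the unmarked units.
        new = [unit for unit in sample if unit not in marked]
        data.append(len(new))
        marked.update(new)
    return tuple(data)

sizes = [10, 8, 12]
d = simulate_cmrr(n=50, sizes=sizes, rng=random.Random(0))
t = sum(d)   # number of distinct urns that received a ball
```

Whatever the random draw, the first coordinate equals $m_1$ and the total lies between the largest sample size and $\min\{m, n\}$.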

Following the above analogy, in the remaining part of the present paper, the
vector $D_k = (U_1, \ldots, U_k)$ represents indifferently either the data obtained by the
CMRR scheme described in Section 1 or the data obtained by the urn scheme
described above. Before presenting the probabilities of interest, we introduce the
notation used.

As usual, the indicator function of a set $A$ is represented by $I_A(x)$. Also, let
$N^* = \{0, 1, \ldots\}$ be the set of non-negative integers.

In general, for $j \ge 1$, the random vector $D_j = (U_1, \ldots, U_j)$ has its observed vector
represented by $d_j = (u_1, \ldots, u_j)$. Analogously, for $T_j = U_1 + \cdots + U_j$, we have $t_j =
u_1 + \cdots + u_j$. Since the population size, $N$, is unknown, it is convenient to use the
notation $P\{D_j = d_j \mid N = n\}$ and $P\{T_j = t_j \mid N = n\}$ for the probabilities of $D_j$ and $T_j$,
respectively. The reason for this is the fact that the range of $T_j$ (of $N$) depends
strongly on the unobserved value of $N$ (observed value of $T_j$).

3.  MAIN  RESULTS 

Given the urn model described in the last section, the following probability
statements become straightforward:

(i) Given $m_1 \in N^*$, $P\{U_1 = u_1 \mid N = n\} = 1$ for any $n \ge m_1 = u_1 = t_1$; otherwise it is equal to
zero; and

(ii) For $j > 1$ and $m_j \in N^*$,

$$P\{U_j = u_j \mid N = n, U_1 = u_1, U_2 = u_2, \ldots, U_{j-1} = u_{j-1}\} = \binom{n - t_{j-1}}{u_j} \binom{t_{j-1}}{m_j - u_j} \binom{n}{m_j}^{-1}$$

for any $n \ge \max\{m_1, \ldots, m_j\}$ and $\max\{m_1, \ldots, m_j\} \le t_j \le \min\{m_1 + \cdots + m_j, n\}$; otherwise it
is equal to zero.
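Statement (ii) is a hypergeometric probability: among the $n$ urns, $t_{j-1}$ are already occupied and $m_j$ are chosen at the $j$-th stage. A minimal sketch of it follows; the function name `step_prob` and the numerical values are illustrative assumptions.

```python
from math import comb

def step_prob(u_j, m_j, t_prev, n):
    """P{U_j = u_j | N = n, T_{j-1} = t_prev} for j > 1, as in statement (ii)."""
    if not (0 <= u_j <= m_j <= n) or not (0 <= t_prev <= n):
        return 0.0
    # comb(a, b) is 0 when b > a, which covers the remaining impossible cases.
    return comb(n - t_prev, u_j) * comb(t_prev, m_j - u_j) / comb(n, m_j)

# Hypothetical values: 20 urns, 6 already occupied, 4 balls at this stage.
probs = [step_prob(u, m_j=4, t_prev=6, n=20) for u in range(5)]
```

The probabilities over $u_j = 0, \ldots, m_j$ add up to one, as they should for a conditional distribution.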

The only difficulty one may have in understanding the above statements is with
the restrictions on $n$ and $t_j$ given in (ii). Note, however, that to assign $m_j$ ($j \ge 1$) balls
to $m_j$ distinct urns one must have $n \ge m_j$ for all $j \ge 1$. On the other hand, since $t_j$ is
the number of distinct chosen urns up to the $j$-th stage, it must not be smaller than the
number of distinct urns chosen in any stage. Also, $t_j$ can be greater neither than the
total number of urns, $n$, nor than the maximum possible number of distinct urns up
to the $j$-th stage, $m_1 + \cdots + m_j$. Finally, it is not difficult to conclude that the sequence
$(T_k)_{k \ge 1}$ is a very interesting Markov chain (given $\{N = n\}$). In fact, it is a
submartingale since, for $j > 1$,

$$E\{T_j \mid N = n, T_{j-1} = t\} = \left(1 - \frac{t}{n}\right) m_j + t \ge t .$$

(Sen, 1982, (2.3), introduced a related property for the one-by-one case.)
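The conditional expectation can be checked numerically by summing over the hypergeometric probabilities of statement (ii). The sketch below uses exact rational arithmetic; the function name and the values of $n$, $t$ and $m_j$ are assumptions chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def expected_next_total(t, m_j, n):
    """E{T_j | N = n, T_{j-1} = t}, summing (t + u) over the law of U_j."""
    total = Fraction(0)
    for u in range(m_j + 1):
        p = Fraction(comb(n - t, u) * comb(t, m_j - u), comb(n, m_j))
        total += (t + u) * p
    return total

n, t, m_j = 20, 6, 4                   # hypothetical values with t <= n
lhs = expected_next_total(t, m_j, n)
rhs = (1 - Fraction(t, n)) * m_j + t   # closed form from the text
```

Both sides agree, and the value is never smaller than $t$, which is the submartingale property.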

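The following important result is a direct consequence of these probability
statements. Recall that $m = m_1 + \cdots + m_k$, $u_1 = t_1 = m_1$, $d_k = (u_1, \ldots, u_k)$, $t_j = u_1 + \cdots + u_j$,
and $u_j \in \{0, 1, \ldots, m_j\}$, for $j = 2, \ldots, k$.

3.1 Theorem: For all $k \ge 2$ and $n \in N^*$ such that $n \ge \max\{m_1, \ldots, m_k\}$,

$$P\{D_k = d_k \mid N = n\} = \frac{n!}{(n - t_k)!} \left\{ \prod_{j=1}^{k} \binom{n}{m_j} \right\}^{-1} \prod_{j=1}^{k} \frac{1}{u_j!} \prod_{j=2}^{k} \binom{t_{j-1}}{m_j - u_j} \, I_B(t_k) ,$$

where $B = \{x \in N^* ;\ \max\{m_1, \ldots, m_k\} \le x \le \min\{m, n\}\}$.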


The proof of this result is very simple. To obtain the joint distribution of $U_1$,
$U_2$, ..., and $U_k$ (the distribution of $D_k$), we need only to consider the product of the
conditional probabilities introduced by (i) and (ii) above.
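The product argument can be illustrated numerically. The sketch below multiplies the conditional probabilities of (i) and (ii) and compares the result with a closed form obtained by telescoping the factors $\binom{n - t_{j-1}}{u_j}$; the function names and the example values are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial, prod

def joint_by_product(d, sizes, n):
    """P{D_k = d | N = n} as the product of statements (i) and (ii)."""
    if d[0] != sizes[0]:
        return Fraction(0)
    p, t = Fraction(1), d[0]
    for u, m in zip(d[1:], sizes[1:]):
        term = Fraction(comb(n - t, u) * comb(t, m - u), comb(n, m))
        if term == 0:
            return Fraction(0)
        p *= term
        t += u
    return p

def joint_closed_form(d, sizes, n):
    """Same probability after telescoping the binomial factors."""
    t_k = sum(d)
    if d[0] != sizes[0] or t_k > n:
        return Fraction(0)
    lead = Fraction(factorial(n),
                    factorial(n - t_k) * prod(comb(n, m) for m in sizes))
    rest, t = Fraction(1), 0
    for u, m in zip(d, sizes):
        rest *= Fraction(comb(t, m - u), factorial(u))
        t += u
    return lead * rest

sizes, n = [3, 2, 4], 7   # hypothetical sample sizes and population size
all_d = list(product(*[range(m + 1) for m in sizes]))
total = sum(joint_by_product(d, sizes, n) for d in all_d)
agree = all(joint_by_product(d, sizes, n) == joint_closed_form(d, sizes, n)
            for d in all_d)
```

Both versions agree on every possible data vector, and the probabilities add up to one.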

The following lemma is a generalization of a result described by Feller (1968),
where the case of $m_1 = \cdots = m_k = 1$ is considered. In fact it indirectly introduces the
distribution of $T_k$. Let $P_e\{m_1, \ldots, m_k; n\}$ represent the probability that, at the end of
the allocation process, exactly $e$ ($\in N^*$) urns are empty.

3.2 Lemma: For all $k \ge 1$ and $n \in N^*$ such that $n \ge \max\{m_1, \ldots, m_k\}$,

$$P_e\{m_1, \ldots, m_k; n\} = \binom{n}{e} \left\{ \prod_{j=1}^{k} \binom{n}{m_j} \right\}^{-1} \sum_{i=0}^{n-e} (-1)^i \binom{n-e}{i} \prod_{j=1}^{k} \binom{n-e-i}{m_j} \, I_E(e) ,$$

where $E = \{x \in N^* ;\ n - \min\{m, n\} \le x \le n - \max\{m_1, \ldots, m_k\}\}$.

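Proof: For $i = 1, \ldots, n$, let $A_i$ be the event "the $i$-th urn is empty at the end of the
allocation process." Hence, for $1 \le k_1 < \cdots < k_i \le n$,

$$P\{A_{k_1} \cap \cdots \cap A_{k_i} \mid N = n\} = \prod_{j=1}^{k} \binom{n-i}{m_j} \left\{ \binom{n}{m_j} \right\}^{-1} .$$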

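On the other hand,

$$P\{A_1 \cup \cdots \cup A_n \mid N = n\} = \sum_{i=1}^{n} (-1)^{i-1} \Sigma_i \, P\{A_{k_1} \cap \cdots \cap A_{k_i} \mid N = n\} ,$$

where $\Sigma_i$ indicates the sum over the set $\{(k_1, \ldots, k_i) ;\ 1 \le k_1 < \cdots < k_i \le n\}$, which is
composed of $\binom{n}{i}$ points. We can then conclude that

$$P_0\{m_1, \ldots, m_k; n\} = 1 - P\{A_1 \cup \cdots \cup A_n \mid N = n\} = \sum_{i=0}^{n} (-1)^i \binom{n}{i} \prod_{j=1}^{k} \binom{n-i}{m_j} \left\{ \binom{n}{m_j} \right\}^{-1} I(n \le m) ,$$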

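where $I(n \le m)$ is the indicator of $n \le m$. Replacing $n$ by $n - e$ in the above expression,
we notice that

$$P_0\{m_1, \ldots, m_k; n-e\} \prod_{j=1}^{k} \binom{n-e}{m_j} (m_j)!$$

is the number of points favorable
to the event "exactly $e$ fixed urns are empty at the end of the allocation process."
Recall that the total number of possible allocations of $m$ balls in $n - e$ urns is
$\prod_{j=1}^{k} \binom{n-e}{m_j} (m_j)!$. Since, among the $n$ urns, there are $\binom{n}{e}$ ways to choose $e$ urns,
we finally have

$$P_e\{m_1, \ldots, m_k; n\} = P_0\{m_1, \ldots, m_k; n-e\} \binom{n}{e} \prod_{j=1}^{k} \binom{n-e}{m_j} (m_j)! \left\{ \prod_{j=1}^{k} \binom{n}{m_j} (m_j)! \right\}^{-1} ,$$

which concludes the proof. ∎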

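The following result is a direct consequence of the above lemma and is the
main result of this paper.

3.3 Theorem: For all $k \ge 1$ and $n \in N^*$ such that $n \ge \max\{m_1, \ldots, m_k\}$,

$$P\{T_k = t \mid N = n\} = \frac{n!}{(n-t)!} \left\{ \prod_{j=1}^{k} \binom{n}{m_j} \right\}^{-1} \sum_{i=0}^{t} \frac{(-1)^i}{i! \, (t-i)!} \prod_{j=1}^{k} \binom{t-i}{m_j} \, I_B(t) ,$$

where $B$ is the set defined in Theorem 3.1.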

To  prove  this  result  we  only  need  to  note  that  if  t  is  the  number  of  distinct 
nonempty  urns,  then  (n-t)  is  the  number  of  empty  urns.  Hence,  a  direct  application 
of  Lemma  3.2  produces  the  desired  result.  Another  consequence,  relevant  for 
statistical  purposes,  is  stated  next. 
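The step from the lemma to the distribution of $T_k$ can be sketched as follows: taking $e = n - t$ empty urns in Lemma 3.2 gives the probability of exactly $t$ occupied urns. The function name `total_pmf` and the example values are assumptions introduced for illustration only.

```python
from fractions import Fraction
from math import comb, prod

def total_pmf(t, sizes, n):
    """P{T_k = t | N = n}, taking e = n - t empty urns in Lemma 3.2."""
    if not (max(sizes) <= t <= min(sum(sizes), n)):
        return Fraction(0)
    # Inclusion-exclusion: allocations that leave none of t fixed urns empty.
    alt = sum((-1) ** i * comb(t, i) * prod(comb(t - i, m) for m in sizes)
              for i in range(t + 1))
    return Fraction(comb(n, t) * alt, prod(comb(n, m) for m in sizes))

sizes, n = [3, 2, 4], 7   # hypothetical sample sizes and population size
pmf = {t: total_pmf(t, sizes, n) for t in range(n + 1)}
```

In the simplest case of two samples of size one from two units, the sketch gives probability 1/2 to each of $t = 1$ and $t = 2$, as expected.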

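3.4 Corollary: For inferences about $N$, the random variable $T_k$ is a sufficient
statistic with respect to $D_k$. The conditional probability of $\{D_k = d_k\}$ given $\{T_k = t\}$
has the following expression:

$$P\{D_k = d_k \mid T_k = t\} = P\{D_k = d_k \mid T_k = t, N = n\} = \prod_{j=1}^{k} \frac{1}{u_j!} \prod_{j=2}^{k} \binom{t_{j-1}}{m_j - u_j} \left\{ \sum_{i=0}^{t} \frac{(-1)^i}{i! \, (t-i)!} \prod_{j=1}^{k} \binom{t-i}{m_j} \right\}^{-1} I_{\{t\}}(t_k) .$$

(Recall that the last factor is the indicator of $\{T_k = t\}$.)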


That $T_k$ is a sufficient statistic follows from Theorem 3.1 and the well-known
Factorization Criterion. Equivalently, sufficiency is also a consequence of the fact
that the above conditional probability is the same for all possible values of $N$. This
probability is directly obtained from the expressions introduced in Theorem 3.1 and
Theorem 3.3.
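Sufficiency can be illustrated numerically: the conditional probability of a data vector given its total should not change with $n$. The sketch below computes the joint probability as the product of statements (i) and (ii) and obtains the distribution of the total by brute-force summation, so it does not rely on any closed form; all names and numerical values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

def joint_prob(d, sizes, n):
    """P{D_k = d | N = n} as the product of statements (i) and (ii)."""
    if n < max(sizes):
        raise ValueError("n must be at least the largest sample size")
    if d[0] != sizes[0]:
        return Fraction(0)
    p, t = Fraction(1), d[0]
    for u, m in zip(d[1:], sizes[1:]):
        term = Fraction(comb(n - t, u) * comb(t, m - u), comb(n, m))
        if term == 0:
            return Fraction(0)
        p *= term
        t += u
    return p

def conditional_given_total(d, sizes, n):
    """P{D_k = d | T_k = sum(d), N = n}, with P{T_k = t} found by enumeration."""
    t = sum(d)
    candidates = product(*[range(m + 1) for m in sizes])
    p_total = sum(joint_prob(e, sizes, n) for e in candidates if sum(e) == t)
    return joint_prob(d, sizes, n) / p_total

sizes, d = [3, 2, 4], (3, 1, 2)   # hypothetical design and observed data
values = {n: conditional_given_total(d, sizes, n) for n in (6, 10, 50)}
```

For this example the conditional probability is 18/31 for each of the three population sizes tried.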


4.  COMMENTS  AND  CONCLUSION 

The factor

$$K(n; t) = \left\{ (n-t)! \prod_{j=1}^{k} \binom{n}{m_j} \right\}^{-1} n! ,$$

that appears in the probability expressions of $D_k$ and $T_k$, is called the likelihood
kernel since it is the smallest factor of these expressions that depends on the value
of $n$, with the remaining ones independent of $n$. To obtain maximum likelihood
estimates and to perform Bayesian analysis, this kernel is the only sample entity that
must be considered. In Leite (1986) these statistical methods are discussed in
detail.
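As a small illustration of how the kernel alone drives likelihood-based inference, the sketch below maximizes $K(n; t)$ over a finite grid of candidate values of $n$. The function names, the search bound `n_max`, and the example data are assumptions made here; this is not the procedure of Leite (1986). Note that when $t = m$ (no recaptures) the kernel increases with $n$, so the result simply equals the search bound.

```python
from fractions import Fraction
from math import comb, factorial, prod

def kernel(n, t, sizes):
    """K(n; t) = n! / ((n - t)! * prod_j C(n, m_j)); zero if n is impossible."""
    if n < t or n < max(sizes):
        return Fraction(0)
    return Fraction(factorial(n),
                    factorial(n - t) * prod(comb(n, m) for m in sizes))

def mle_population_size(t, sizes, n_max):
    """Grid search for the n in [max(t, max m_j), n_max] maximizing K(n; t)."""
    start = max(t, max(sizes))
    return max(range(start, n_max + 1), key=lambda n: kernel(n, t, sizes))

# Hypothetical data: two samples of 10 units, 17 distinct units in total
# (that is, 3 recaptures in the second sample).
n_hat = mle_population_size(t=17, sizes=[10, 10], n_max=200)
```

For this two-sample example the ratio $K(n; t) / K(n-1; t) = (n-10)^2 / \{n (n-17)\}$ exceeds one exactly when $n < 100/3$, so the maximum is attained at $n = 33$.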
Finally, notice that another kind of data could be produced by the urn model
described above. For instance, consider the vector $(X_0, X_1, \ldots, X_k)$, where $X_i$ ($0 \le i
\le k$) is the number of urns with exactly $i$ balls at the end of the allocation process. In
terms of population units, $X_i$ is the number of individuals captured exactly $i$ times.
Recall that $T_k = X_1 + \cdots + X_k$ and $X_0 = N - T_k$. With respect to these data, is $T_k$ still a
sufficient statistic? The answer is again yes. Clearly, after the value $t$ of $T_k$
has been recorded, all kinds of nonempty urns must be among these $t$,
independently of any possible particular value $N$ may assume. Hence, $T_k$ must be
sufficient. To formalize this conclusion we state the following result, the proof of
which we shall omit since it would follow the same steps as the ones presented
here.

4.1 Theorem: For all $k \ge 2$ and $n \in N^*$ such that $n \ge \max\{m_1, \ldots, m_k\}$,

$$P\{X_1 = x_1, \ldots, X_k = x_k \mid N = n\} = K(n; t) \left\{ \prod_{j=1}^{k} (m_j)! \, (x_j)! \right\}^{-1} h(x_1, \ldots, x_k) \, I_B(t) ,$$

where: (a) the elements of $(x_1, \ldots, x_k)$ take values on $\{0, 1, \ldots, k\}$ and satisfy the
equations $x_1 + 2 x_2 + \cdots + k x_k = m$ and $x_1 + \cdots + x_k = t$; and (b) $h(x_1, \ldots, x_k)$ is the number
of ways in which $m$ balls can randomly be allocated in $t$ urns so that $x_1$ urns receive
one ball, $x_2$ urns receive 2 balls, and so on up to $x_k$ with $k$ balls.

Here also, by a direct application of the factorization criterion, we conclude that
$T_k$ is sufficient. To prove the above result one may need to follow Feller (1968),
where the one-by-one case is considered.

We have shown that up to a particular stage, say $k$, the only relevant
information about the unknown parameter of interest, $N$, is contained in $T_k$ or,
equivalently, in the likelihood kernel. If, in the place of a fixed stopping step, $k$,
one considers a random stopping rule, the above kernel would still be the minimal
sufficient statistic. For example, analogously to the negative binomial rule,
suppose that $t$ is fixed a priori and $k$ is the number of steps required to obtain $t$. In
terms of randomness, $k$ and $t$ would change roles; that is, $k$ would be the
observation of a random variable and $t$ would be the fixed constant. Hence, any
good inference about $N$ must rely on a painstaking analysis of the
likelihood kernel, $K(n; t)$. If a random stopping rule is used, the
sampling scheme is called the Capture/Recapture sampling process instead of CMRR.